Immigration-related debates in the UK HoC

Amir Firestone (192245), Ofer Dotan (195169), Jan Panhuysen (191679)

20/12/2020

Abstract

A text analysis of UK House of Commons (HoC) parliamentary debates from 2010 to 2020 reveals that discussions relating to immigration increased markedly following the general election in 2015 and in the run-up to the Brexit referendum, a period that also coincided with the European migration crisis. A topic model analysis shows that migration-related discussions in the HoC revolved around four central themes: social welfare, economic migration, refugees fleeing conflict, and the humanitarian concerns of migration. The relative prevalence of these topics in party-specific contributions changed over the years in step with significant global and national events. Topic-related sentiments also help identify trends in migration-related debates in the HoC. While the UK’s five largest political parties exhibit positive sentiment toward the topic of economic migration, they are found to overwhelmingly present the humanitarian concerns of migration in a negative frame. These findings suggest the potential of using these methods to inform coalition building on issues related to immigration.

1. Introduction

Like most world economies, the UK is facing a challenging period in the wake of the Coronavirus pandemic. During these already difficult times, immigration is likely to become a sensitive topic in policymaking, particularly given that the UK’s exit from the European Union, which is only now beginning to come into effect, was driven in part by the desire for immigration-related legislation to be made on the UK’s terms.

A multitude of debates relate to that issue. While many of those who voted “leave” were most concerned about the number of immigrants coming into Britain (Sapsted, 2020), others argue that preventing the immigration of skilled workers is likely to hinder the country’s economic recovery due to gaps in jobs that are currently in high demand (Grierson, 2020). In fact, the COVID-19 outbreak has drawn attention to just how much the UK’s healthcare and social care systems depend on workers who originally came from abroad (Sapsted, 2020). In any case, regardless of how circumstances develop in the UK and other countries, some people will still leave their homes in search of a better future. Hence, debates on immigration will likely remain of core parliamentary importance.

As discussions in the House of Commons (HoC) are instrumental to the unfolding of policies and pieces of legislation on immigration, understanding politicians’ contributions to these debates can provide valuable insights into how certain policies came into effect. It is therefore important to explore how parliamentary debates on immigration have changed over the years leading up to the 2015 general election, during the Syrian refugee crisis and since the Brexit referendum.

Covering this exact time period and building on an approach of using speech data to investigate themes and related sentiments in parliamentary debates (Bara, Weale & Bicquelet, 2007), our analysis investigates the most prevalent topics related to immigration and parties’ respective attitudes towards them.

With this, we aim to capture changes in the prevalence of migration-related debates in parliament, how different parties cover such topics over time, and what sentiment is associated with specific topics of immigration. Comprehensive insights in this direction, for which this study provides guidance, could help policymakers develop actionable strategies for coalition building around immigration-related policies.

2. Packages and Data

Our exploration is based on HoC speech data from a database called ParlSpeech V2 by Christian Rauh and Jan Schwalbach (2020). This dataset is unique in its scope, covering all parliamentary debates from 1998 until 2020, resulting in 1,956,223 speeches (Rauh & Schwalbach, 2020, p. 10). Speeches represent individual contributions by members of parliament and were collected from the digital Commons Hansard, which contains the plenary protocols and documents from which speech text and metadata are extracted. As a result, the corpus includes a range of covariates, such as party affiliation and agenda, which facilitate a detailed set of contextual and party-specific analyses. In our analysis, we make use of these metadata and leverage the Lexicoder 2015 sentiment dictionary, which was established to produce reliable estimates based on 2,858 word patterns relating to negative sentiment and 1,709 word patterns indicating positive sentiment (Young & Soroka, 2012). This sentiment dictionary is particularly relevant to our purposes as it was designed to analyze sentiment in the political language of legislative speech and has been applied specifically to migration discourse (Heidenreich et al., 2020).

3. Subset & Method

As a first step, we focus on texts from 2010 to the present day, leaving us with slightly fewer than 750,000 individual contributions. 2010 is a good starting point for our analysis because it was the year of a general election and of the accompanying Tory manifesto. This time frame includes observations both before our main events of interest (the 2015 General Election, the Syrian refugee crisis, and the Brexit referendum) and after them, from 2016 until 2020.

In terms of content, we subset the corpus to those contributions that either contain one of the keywords related to our topic of immigration or were made in response to agenda points that contain such a keyword. The keywords we used were “I/immigra*”, “R/refuge*” and “A/asylum”. In line with existing research, we expect parliamentary debates to be explicit in their language, meaning that if immigration is discussed, one of these keywords will appear either in the agenda description or in the speech itself (Van Dijk, 2000). This should allow us to capture most of the substantive debates regarding immigration.

Finally, to remove noise from unrelated text, we exclude speeches shorter than 10 words and retain only the contributions of the five parties that made the most contributions overall: the Conservatives, Labour, the Liberal Democrats, the SNP and the DUP. The downside of such subsetting is that we lose documents that discuss immigration without mentioning any of the three keywords in either the agenda description or the text.

The aforementioned steps yield a final subset of 22,257 individual contributions, representing about 3% of the parliament’s overall debates during that period as well as about 6.25% of the overall time spent in debates. The five parties selected represent about 98% of the overall contributions made.
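For illustration, the subsetting rules described above can be sketched in Python. This is a minimal sketch and not the code used for the analysis: the record layout (`text`, `agenda`, `party`), the party codes and the toy speeches are hypothetical, and the regular expression is one possible reading of the three keyword patterns.

```python
import re

# One possible reading of "I/immigra*", "R/refuge*" and "A/asylum".
KEYWORDS = re.compile(r"\b(?:[Ii]mmigra\w*|[Rr]efuge\w*|[Aa]sylum)\b")
PARTIES = {"Con", "Lab", "LibDem", "SNP", "DUP"}  # hypothetical party codes
MIN_WORDS = 10  # speeches shorter than this are dropped


def is_relevant(speech):
    """Return True if a speech passes the party, length and keyword filters."""
    if speech["party"] not in PARTIES:
        return False
    if len(speech["text"].split()) < MIN_WORDS:
        return False
    # A keyword may appear in the speech itself or in the agenda description.
    return bool(KEYWORDS.search(speech["text"]) or KEYWORDS.search(speech["agenda"]))


# Toy records, invented purely for illustration.
speeches = [
    {"party": "Con", "agenda": "Home Affairs",
     "text": "The Government must reform the immigration system so that skilled workers can come here."},
    {"party": "Lab", "agenda": "Asylum Seekers: Accommodation",
     "text": "I thank the Minister for that answer and ask about housing support for new arrivals."},
    {"party": "SNP", "agenda": "Home Affairs", "text": "What about refugees?"},
    {"party": "Con", "agenda": "Fisheries",
     "text": "Will the Secretary of State update the House on fishing quotas for next year?"},
    {"party": "Green", "agenda": "Home Affairs",
     "text": "We should welcome refugees and give them the right to work while their claims are processed."},
]

subset = [s for s in speeches if is_relevant(s)]
```

In this toy example, the second speech is retained only because its agenda description contains a keyword, which mirrors the "speech or agenda" rule in the text.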

To offer a blueprint for this kind of analysis, we use both the general subset of HoC parliamentary debates described above and a more targeted subset consisting of the text immediately surrounding the selected keywords. The targeted subset contains windows of text spanning 20 words before and after each keyword; when two keywords fall within the same window, the overlapping text is included only once. We created this subset in order to analyze in more depth how these terms are used in context.
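The keyword-window idea can be sketched as follows. This is a hedged illustration rather than the original implementation: the function name `keyword_windows`, the whitespace tokenization and the prefix-based keyword match are assumptions, and merging overlapping windows is one interpretation of not repeating text when two keywords occur close together.

```python
import re

# Prefix match on the token; a simplification of the keyword patterns.
KEYWORD = re.compile(r"(?:immigra|refuge|asylum)", re.IGNORECASE)


def keyword_windows(text, width=20):
    """Return text windows of `width` words around each keyword, merging overlaps."""
    tokens = text.split()
    hits = [i for i, tok in enumerate(tokens)
            if KEYWORD.match(tok.strip(".,;:!?\"'()"))]
    spans = []
    for i in hits:
        start, end = max(0, i - width), min(len(tokens), i + width + 1)
        if spans and start <= spans[-1][1]:
            # Window overlaps or touches the previous one: extend it instead
            # of emitting the shared words twice.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return [" ".join(tokens[s:e]) for s, e in spans]
```

With `width=20`, a speech that mentions "refugees" and "asylum" ten words apart would yield a single window covering both mentions rather than two largely identical ones.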

4. Descriptive Findings

A first analysis of the data reveals a rapid increase in discussion about immigration following the general election in 2015, which coincides with rising immigration across the EU. Technically speaking, this analysis counts individual contributions irrespective of their length. The number of agenda points devoted or otherwise related to immigration almost tripled between 2010 and 2020, with a nearly linear increase over the years.

Figure 1: Prevalence of immigration debates over time, using the number of words as a proxy for time spent on debating.

More comprehensively, Figure 1 uses the total number of words used in debates as an indicator of the time spent on the respective debate. Considering that the HoC has only limited time available to discuss agenda points, devoting more time to a debate may indicate certain priorities. In the months before January 2018, where there seems to be a decrease in debates, there were several breaks, including the summer holiday, an external conference, the November break and the Christmas holiday. To even out spikes and breaks, which are likely due to the different recess dates within the HoC (UK Parliament, 2020), the plot depicts the 6-month average of the total number of words spent on immigration-related debates. Looking at averages in this way makes it possible to observe whether debate preferences persisted over time or peaked only for short periods.
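The smoothing step can be illustrated with a short sketch. It assumes a trailing 6-month window and dates formatted as `YYYY-MM-DD`; the text does not state whether the average behind Figure 1 is trailing or centered, so both the window alignment and the function names are assumptions.

```python
from collections import defaultdict


def monthly_word_totals(speeches):
    """Sum word counts per calendar month from (date, text) pairs."""
    totals = defaultdict(int)
    for date, text in speeches:
        totals[date[:7]] += len(text.split())  # 'YYYY-MM' key
    return dict(sorted(totals.items()))


def rolling_mean(values, window=6):
    """Trailing moving average; early positions use the months available so far."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

Note that months without any matching speech do not appear in `monthly_word_totals`; for a recess month to pull the average down, it would have to be added explicitly with a total of zero.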

From January 2012 to November 2014, we observe a steady increase in time (averaged over 6 months) spent on debates related to immigration. This is likely due to the monthly spikes in both January and June of 2014. The second half of 2014 as well as the first half of 2015 saw less time being devoted to immigration-related debates. This suggests that, overall, the content of our subset did not increase in prevalence before the 2015 General Election.

However, between May 2015 and June 2016, the year following the general election and leading up to the Brexit referendum, there was a major increase in time spent on immigration-related debates. On average, the HoC spent almost twice as much time on immigration-related debates from September 2015 to February 2016 as it did from December 2014 to May 2015. This seems to indicate that immigration-related debates gained priority after the General Election and in the run-up to the referendum.


Figure 2: Concentration of party-specific contributions.

Zooming in further, while the SNP and the DUP deliberated immigration-related topics more frequently after Brexit, the other parties engaged with immigration-related speech at a more constant rate. Importantly, the information that can be gathered from this density plot is limited: it tells us nothing about the substance of these speeches, only, crudely, how many words were used in them. Nevertheless, this descriptive visualization does help us get an initial sense of the prevalence of immigration-related speech produced by each of the parties we are focusing on.

5. Topical Analysis

In order to better understand developments in migration-related debates over time and the different party positions in these debates, it is important to distinguish these debates by the themes they discuss. To do this, we use the Structural Topic Model (STM).

Using the STM package in R, we model 6 topics from the content of migration-related documents and assign each document a theta score for each topic. These scores represent the proportions of prevalence of each topic for each document. Next, we test how exclusive and coherent these topics are. While 6 topics may seem few, we are modelling topics from a subset of parliamentary debates that use migration-related keywords. This already limits the extent of topics potentially covered by these debates. Our test also shows that 6 is enough for allocating sufficiently exclusive topics.

We add the topic scores of each document to our dataset of migration-related debates. Next, we name each topic after its first 3 FREX terms, the words that are both frequent in and exclusive to that topic. One topic, labelled “allowance, tax, dwp,” likely refers to content on migration that relates to social welfare. The topic named “eea, seasonal, visa” relates to the portion of debates that address the economic dimensions of migration. “Iraq, Rohingya, Libya” describes the portion of debates that concern refugees fleeing conflict. The topic “tb, Sikh, Auschwitz” seems to indicate a collection of less prevalent migration-related topics such as tuberculosis, specific migrant communities, and the Holocaust. The topic “unaccompanied, trafficked, detention” describes the most vulnerable populations of migrants and related humanitarian concerns. Finally, the topic “vote, voting, motion” captures the procedural vocabulary of the House of Commons.
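To make the idea behind FREX labels more concrete, the following sketch scores words by the harmonic mean of their within-topic frequency rank and their exclusivity rank. It is a simplified illustration of the general FREX idea, not the implementation in the R package used for the analysis; the function names, the equal weighting `w=0.5` and the toy word probabilities are all assumptions.

```python
from bisect import bisect_right


def ecdf(values):
    """Share of values less than or equal to each value (empirical CDF)."""
    ordered = sorted(values)
    return [bisect_right(ordered, v) / len(values) for v in values]


def frex_scores(beta, w=0.5):
    """beta: one list of word probabilities per topic, same vocabulary order."""
    n_words = len(beta[0])
    col_sums = [sum(topic[j] for topic in beta) for j in range(n_words)]
    scores = []
    for topic in beta:
        # Exclusivity: a word's probability here relative to all topics.
        excl = [topic[j] / col_sums[j] for j in range(n_words)]
        f_rank, e_rank = ecdf(topic), ecdf(excl)
        scores.append([1 / (w / e_rank[j] + (1 - w) / f_rank[j])
                       for j in range(n_words)])
    return scores


def top_terms(vocab, beta, n=3):
    """Return the n highest-scoring FREX terms for each topic."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in frex_scores(beta)]


# Toy vocabulary and probabilities, invented for illustration only.
vocab = ["visa", "seasonal", "the", "refugee"]
beta = [[0.30, 0.20, 0.45, 0.05],
        [0.02, 0.03, 0.45, 0.50]]
labels = top_terms(vocab, beta, n=1)
```

In this toy example, a word such as "the" is the most frequent word in the first topic but is shared equally with the second, so it is outranked by "visa", which is both frequent and largely exclusive to that topic.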


Figure 3: Topics by party.

In Figure 3, we plot the proportions of topics covered by each party over all the years of debate (2010-2020). This plot illustrates two main things. First, it demonstrates the relative prevalence of these six migration-related topics in parliamentary debates. From this we can see that, aside from procedural vocabulary, the three most prevalent topics in these migration debates are economic migration, refugees fleeing conflict, and the humanitarian concerns of migration.

In addition, this plot compares how different parties discuss migration in terms of these six topics. Here we see that the Conservatives discuss economic migration more than any other major party and to a greater extent than they discuss refugees fleeing conflict or the humanitarian concerns of migration. In comparison to the Conservatives, the Labour party discusses migration more often in the context of social welfare and humanitarian concerns.


Figure 4: Topic prevalence across time by party.

To explore the change in topic prevalence over time, Figure 4 plots the yearly average prevalence of each topic between 2010 and 2020 (solid black line). This plot shows that while the topic of social welfare enjoys little prevalence in migration-related debates, it was most prevalent in the earlier years of the decade and has begun regaining attention in recent years. We also find that the topic of economic migration experienced a sharp decline in attention after 2013 and has only risen to prominence once again in post-Brexit debates. The topic of refugees fleeing conflict first received attention in 2011, during the civil war in Libya, before becoming nearly three times more prevalent at the peak of the Syrian refugee crisis in 2015. The topic lost traction in 2016 after the Brexit referendum shifted political attention to other concerns regarding migration. Humanitarian concerns reached their highest level of prevalence in debates in 2016. This is likely due to the increased public awareness of the victims of migration following the peak of Europe’s migration crisis and Mediterranean crossings (UNHCR, 2020).
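The yearly averages described above amount to a grouped mean of the document-level theta scores. A minimal sketch, assuming each document is a record with hypothetical `year`, `party` and `theta` fields (the toy values are invented):

```python
from collections import defaultdict


def mean_theta(docs, keys=("year",)):
    """Average the topic proportions (theta) of documents within each group."""
    sums, counts = {}, defaultdict(int)
    for doc in docs:
        group = tuple(doc[k] for k in keys)
        counts[group] += 1
        acc = sums.setdefault(group, [0.0] * len(doc["theta"]))
        for k, share in enumerate(doc["theta"]):
            acc[k] += share
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}


# Toy documents with two topics each, for illustration only.
docs = [
    {"year": 2015, "party": "Con", "theta": [0.50, 0.50]},
    {"year": 2015, "party": "Lab", "theta": [0.25, 0.75]},
    {"year": 2016, "party": "Con", "theta": [1.00, 0.00]},
]

by_year = mean_theta(docs)                                # overall yearly average
by_year_party = mean_theta(docs, keys=("year", "party"))  # party-specific averages
```

Grouping by `("year",)` corresponds to the overall yearly average, while grouping by `("year", "party")` gives the party-specific trends.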

Next, we can explore the prevalence of the topics by party over these ten years (2010-2020). In addition to showing party-specific trends in topical focus over time, Figure 4 also shows that the topic coverage of parties converges in 2015 and 2016. Three explanations may shed light on why we see this trend. The first relates to significant external developments that are relevant to UK national interests, such as the Syrian refugee crisis in Europe. As a result, certain topics related to migration enter the agenda as large-scale national issues that are relevant to all parties. Secondly, as usually occurs in the run-up to a general election, all parties discuss broadly similar agenda points that lie at the core of the political discourse. Lastly, the Brexit referendum in 2016 drastically changed the context for many kinds of policies on a national level. This means that some topics of migration, such as economic migration, became important for the whole UK and all parties.

6. Sentiment

Using the Lexicoder 2015 sentiment dictionary, we compute average sentiment scores for each document. We map sentiment scores by party over the years and find several trends worth mentioning. Compared to the Labour party, the Conservative party exhibits more positive sentiment overall, and this sentiment increases gradually over the years. Further, the Liberal Democrats (LibDem), the Scottish National Party (SNP) and the Democratic Unionist Party (DUP) exhibit more fluctuating sentiment toward immigration-related topics. Specifically, before the general election, both the SNP and the DUP had more negative sentiment toward immigration.

Figure 5: Observed sentiment by party.

However, interpreting party-sentiment trends is challenging because the sentiment score combines the sentiment attached to the subjects of these debates with the sentiment of the words that express speakers’ attitudes towards these subjects. For example, an MP who is supportive of helping victims of human trafficking may mention specific phrases, such as “human trafficking,” which are associated with negative sentiment. This makes it difficult to interpret sentiment simply as attitude towards migration and highlights the importance of disentangling these effects.
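The following sketch illustrates this confound with a dictionary-based score. It is not the Lexicoder dictionary or the scoring formula used in the study: the word patterns are a tiny invented lexicon, and the score (positive minus negative matches, divided by the number of tokens) is just one common way of computing such a measure.

```python
from fnmatch import fnmatchcase

# Toy lexicon, invented for illustration; NOT actual Lexicoder entries.
POSITIVE = ["support*", "protect*"]
NEGATIVE = ["victim*", "traffick*", "exploit*"]


def count_matches(tokens, patterns):
    """Count tokens that match at least one glob-style pattern."""
    return sum(any(fnmatchcase(tok, p) for p in patterns) for tok in tokens)


def sentiment_score(text):
    """(positive - negative) matches divided by the number of tokens."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = count_matches(tokens, POSITIVE)
    neg = count_matches(tokens, NEGATIVE)
    return (pos - neg) / len(tokens)


score = sentiment_score("We must support the victims of trafficking and exploitation.")
```

Although the example sentence expresses a supportive stance, its score is negative because the subject-matter words outnumber the single positive term.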

Figure 6: Observed sentiment in the context of keywords.

Exploring the sentiment of text surrounding specific keywords, we see that, overall, sentiment becomes more positive over time. This plot also shows that while text surrounding mentions of immigration has relatively low sentiment scores, text surrounding mentions of refugees has more positive sentiment.

To explore the sentiment related to specific types of migration-related debate in more depth, we apply our sentiment estimates to the topics generated by our topic model. To do this, we calculate the correlation of sentiment scores and topic scores for all documents in the subset.

##                                       correlations
## iraq, rohingya, libya                -0.0127621260
## unaccompanied, trafficked, detention -0.0983944435
## tb, sikh, auschwitz                  -0.0429929832
## vote, voting, motion                 -0.0006179458
## eea, seasonal, visa                   0.1478371098
## allowance, tax, dwp                  -0.0125341125

Table 1: Correlation of sentiments and topics.

The result presents estimates of the correlation between positive sentiment and each topic category. All correlations are small in absolute terms, so the comparisons below are relative. The topic of economic migration shows the strongest correlation with positive sentiment (0.148). This makes substantive sense, as many sectors of the UK benefit economically from migration. Discussion about economic migration in parliament would therefore enjoy a generally positive sentiment. In contrast, discussion related to the humanitarian concerns of migration shows the strongest inverse correlation with positive sentiment (-0.098). This means that documents in which this topic is more prevalent tend to carry more negative sentiment. Substantively speaking, this might indicate that members of parliament are overall concerned about the humanitarian risks of migration and wish to avoid or prevent them. Alternatively, this estimate may have nothing to do with how MPs discuss the topic of humanitarian risks and may instead reflect the sentiment of the words that describe the subject matter of humanitarian risks, which are probably associated with negative sentiment. The topic containing procedural vocabulary is essentially uncorrelated with sentiment (-0.001). As this vocabulary carries little substantive content, this makes sense and affirms our understanding of these topic-sentiment correlation estimates. The topics of social welfare and refugees fleeing conflict both show slightly negative correlations with sentiment (both about -0.013), possibly indicating mixed stances on these topics in parliament that overall verge negative. While the topic titled “tb, sikh, auschwitz” shows a comparatively large negative correlation with sentiment (-0.043), it is difficult to draw any conclusions from this because the topic is inconsistent and contains multiple subjects of debate.
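The topic-sentiment correlations in Table 1 can be reproduced in spirit with the sketch below. It assumes a Pearson correlation between each document's sentiment score and its theta score for a given topic; the helper names and the toy values are illustrative assumptions, not the data behind Table 1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def topic_sentiment_correlations(sentiment, theta, labels):
    """Correlate document sentiment with each topic's share across documents."""
    return {label: pearson(sentiment, [row[k] for row in theta])
            for k, label in enumerate(labels)}


# Toy data: four documents, two topics (shares sum to 1 per document).
sentiment = [0.2, 0.8, 0.5, 1.1]
theta = [[0.1, 0.9], [0.7, 0.3], [0.4, 0.6], [0.9, 0.1]]
labels = ["eea, seasonal, visa", "unaccompanied, trafficked, detention"]
correlations = topic_sentiment_correlations(sentiment, theta, labels)
```

Because topic shares sum to one within each document, a positive correlation for one topic mechanically implies negative correlations elsewhere, which is one more reason to read the estimates as relative rather than absolute.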

Simple regressions of sentiment on each topic score, together with the corresponding plots, are included in the topic section of the appendix. They are exploratory, are not intended to show causal relations, and should be interpreted with caution.

7. Conclusion & Shortcomings

Our exploratory study illustrates the value of using topic models and sentiment analysis to inform coalition building for policymakers working on immigration-related policy. This is particularly valuable today, in the wake of the COVID-19 crisis and with Britain’s exit from the EU becoming a reality, as the importance of immigration, and of economic immigration in particular, is likely to increase dramatically in parliament.

Testing these methods, we learn that immigration-related debates became increasingly prevalent between 2010 and 2020, with a substantial increase from 2015 onwards. This trend coincides with an increase in discussions of refugees fleeing conflict areas and the humanitarian concerns associated with these events. Furthermore, we show that following the results of the Brexit referendum, attention to refugees decreased relative to the topic of economic migration, which suddenly attracted more attention. We also find that while economic migration is framed in more positive language, the humanitarian concerns of migration are correlated with negative sentiment.

However, a more fine-tuned approach is necessary for extracting meaningful insights for coalition building around immigration policy. This is especially important in terms of focusing sentiment analysis on language that expresses party attitudes towards migration-related topics, rather than the words that capture the subject matter of these topics. Thus, further research should pose more specific questions to differentiate these two elements of sentiment. This more targeted approach is likely to yield more accurate estimations.

References

Bara, J., Weale, A., & Bicquelet, A. (2007). Analysing parliamentary debate with computer assistance. Swiss Political Science Review, 13(4), 577-605.

Grierson, J. (2020). Post-Brexit key worker shortage ‘may hamper UK economic recovery’. The Guardian. https://www.theguardian.com/politics/2020/dec/15/post-brexit-key-worker-shortage-may-hamper-uk-economic-recovery

Heidenreich, T., Eberl, J. M., Lind, F., & Boomgaarden, H. (2020). Political migration discourses on social media: a comparative perspective on visibility and sentiment across political Facebook accounts in Europe. Journal of Ethnic and Migration Studies, 46(7), 1261-1280.

UK Parliament. (2020). List of previous Commons Recess Dates. https://www.parliament.uk/about/faqs/house-of-commons-faqs/business-faq-page/recess-dates/list-of-previous-commons-recess-dates/

UNHCR. (2020). Mediterranean Situation. https://data2.unhcr.org/en/situations/mediterranean

Sapsted, D. (2020). Is Brexit Britain pro-immigration and what does it mean for business? Relocate Global. https://www.relocatemagazine.com/articles/enterprise-is-brexit-britain-pro-immigration-and-what-does-it-mean-for-business-dsapsted-au20

Young, L. & Soroka, S. (2012). Affective News: The Automated Coding of Sentiment in Political Texts. Political Communication, 29(2), 205–231.

Appendix

Further descriptive insights

Further insights on sentiment & topics

The code underlying the following figures investigates the relation between sentiment and topic and how this relation differs across parties. It does not aim to show causal relations but to give the reader an intuition for how sentiment varies between parties depending on the topic discussed.

## 
## ===========================================================================================
##                                                     Dependent variable:                    
##                                  ----------------------------------------------------------
##                                                          sentiment                         
##                                    (1)       (2)        (3)      (4)       (5)       (6)   
## -------------------------------------------------------------------------------------------
## theta                            -0.051*  -0.423***  -0.287***  -0.003   0.625***  -0.067* 
##                                  (0.027)   (0.029)    (0.045)  (0.030)   (0.028)   (0.036) 
##                                                                                            
## Constant                         0.666***  0.735***  0.685***  0.659***  0.539***  0.667***
##                                  (0.007)   (0.008)    (0.007)  (0.010)   (0.008)   (0.008) 
##                                                                                            
## -------------------------------------------------------------------------------------------
## Observations                      22,257    22,257    22,257    22,257    22,257    22,257 
## R2                                0.0002    0.010      0.002   0.00000    0.022     0.0002 
## Adjusted R2                       0.0001    0.010      0.002   -0.00004   0.022     0.0001 
## Residual Std. Error (df = 22255)  0.939     0.934      0.938    0.939     0.929     0.939  
## F Statistic (df = 1; 22255)       3.625*  217.567*** 41.212***  0.008   497.269***  3.497* 
## ===========================================================================================
## Note:                                                           *p<0.1; **p<0.05; ***p<0.01

Output table for the linear models (lm) regressing sentiment on each topic score (theta).

Plot: correlation between sentiment and topic.

Plot: correlation between sentiment and topic by party.